1. INTRODUCTION

The power transformer is one of the most important pieces of equipment in a power system. A transformer
fault causes a breakdown in the power system, which results in financial losses to the power industry and
inconvenience to the end user. In power transformers, liquid insulation in the form of mineral oil
(transformer oil) is used as the cooling agent, and impregnated cellulose (paper) is used as the solid
insulation. Transformer oil is very important because it provides electrical insulation, dissipates heat as
a cooling agent, protects and isolates the core and windings and, moreover, prevents direct contact of
atmospheric oxygen with the windings.

The paper insulation of the windings deteriorates with time of usage, which degrades the solid
insulation [1], [2], [3]. The liquid insulation (transformer oil), when heated during operation of the
transformer, decomposes and produces gases such as hydrogen (H2), methane (CH4), acetylene (C2H2),
ethylene (C2H4) and ethane (C2H6). These gases degrade the quality of the transformer oil and impair its
properties as coolant and insulator, which may result in breakdown of the transformer. This can be
prevented by measuring the amount of gases dissolved in the transformer oil at regular intervals.
Conventional methods such as the Rogers ratio method, the Dornenburg ratio method, the Duval triangle
method and key gas ratio methods are used to identify the fault from the amounts of gases dissolved in the
transformer oil. However, these methods sometimes indicate a wrong fault type [4], [5]. To overcome these
shortcomings of the conventional methods, various software-based intelligent methods such as artificial
neural networks [6], [7], [8], [9], [11], wavelet analysis, Learning Vector Quantization (LVQ),
Probabilistic Neural Network (PNN), fuzzy logic, Support Vector Machine classifiers and Self-Organizing
Map classifiers have been proposed to find faults in transformers [12], [13], [14], [15], [16]. Since fault
finding is a diagnosis process, neural networks can be applied successfully [18].

This paper proposes a new ANN (Artificial Neural Network) based approach, the Random Neural
Network, to find faults in power transformers, using the BFGS (Broyden-Fletcher-Goldfarb-Shanno) and LM
(Levenberg-Marquardt) algorithms. Linear regression and Bland-Altman techniques are used to compare the
BFGS and LM algorithms, and the Receiver Operating Characteristic (ROC) curve is used to validate the
results. It is found that BFGS outperforms the LM algorithm.


2. FAULT CLASSIFICATION ALGORITHM APPROACH 

The proposed transformer fault classification scheme is implemented in the following steps.
The raw data are collected from PSTCL (Punjab State Transmission Corporation Ltd.) labs and preprocessed.
Feature selection is then performed so that all faults are covered in the selected data. In the last step
of the classification, ANN (Artificial Neural Network) based classifiers are applied to determine the
different faults.


3. PROBLEM FORMULATION 
3.1. Dissolved gas analysis (DGA) 

DGA (Dissolved Gas Analysis) is one of the most common techniques used for incipient fault
diagnosis. Oil samples are collected to perform DGA and hence to determine the amount of gas in the oil
sample. The level of gases generated in an oil-filled transformer provides the first level of information
for fault detection based on various conventional methods. Faults in oil-filled transformers can be
identified according to the amount and type of gases generated. These gases are hydrogen (H2), methane
(CH4), ethylene (C2H4), ethane (C2H6), acetylene (C2H2), carbon monoxide (CO), and carbon dioxide (CO2).
The conventional methods are generally based on defined principles such as gas concentrations, key gases,
key gas ratios, and graphical representations. Under IEEE Standard C57.104-2008, Key Gas Analysis, the
Dornenburg and Rogers Ratio Methods, Nomograph, IEC Ratio, Duval Triangle, and CIGRE Method are listed
for finding faults in transformers. DGA can find faults such as partial discharge (corona), overheating,
and arcing in many different power transformers. Like a blood test for the human body, DGA can provide an
early diagnosis of incipient faults and increase the chance of finding an appropriate maintenance schedule
or repair if required. Table 1 shows the different faults of power transformers as given by IEC/IEEE.
These are used to train the RNN to find faults in transformers. Currently, seven methods based on
dissolved gas data are used to diagnose the types of faults:

a. Key Gas Method,
b. Dornenburg Ratio Method,
c. Rogers Ratio Method,
d. Nomograph Method,
e. IEC Ratio Method,
f. Duval Triangle Method and
g. CIGRE Method.
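
As an illustration of the quantities that several of the listed methods start from, the following minimal Python sketch computes the three gas ratios commonly used by the IEC ratio method (CH4/H2, C2H2/C2H4, C2H4/C2H6) and the relative percentages of CH4, C2H4 and C2H2 that serve as coordinates in the Duval triangle. The function names are hypothetical, the gas values are those of sample 1 in Table 2, and no fault thresholds are applied because this paper does not give them.

```python
import math


def gas_ratios(h2, ch4, c2h6, c2h4, c2h2):
    """Return three key gas ratios (inputs in ppm).

    A zero denominator yields math.inf, and 0/0 yields math.nan,
    so that the caller can decide how to treat undetected gases.
    """
    def ratio(num, den):
        if den == 0:
            return math.nan if num == 0 else math.inf
        return num / den

    return {
        "CH4/H2": ratio(ch4, h2),
        "C2H2/C2H4": ratio(c2h2, c2h4),
        "C2H4/C2H6": ratio(c2h4, c2h6),
    }


def duval_percentages(ch4, c2h4, c2h2):
    """Return (%CH4, %C2H4, %C2H2), the relative shares of the three gases."""
    total = ch4 + c2h4 + c2h2
    if total == 0:
        raise ValueError("at least one of the three gases must be non-zero")
    return tuple(100.0 * x / total for x in (ch4, c2h4, c2h2))


# Sample 1 of Table 2 (ppm)
sample = {"h2": 95, "ch4": 10, "c2h6": 0, "c2h4": 11, "c2h2": 39}
ratios = gas_ratios(**sample)
pct = duval_percentages(sample["ch4"], sample["c2h4"], sample["c2h2"])
print(ratios)
print(pct)
```

Each conventional method then compares such ratios or percentages with its own published limits to name a fault type.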


Table 1. Faults in Power Transformers 


S. No.  Type of fault                                        Short name  Fault code
1       Partial discharge                                    PD          0
2       Partial discharge with low energy density            D1          1
3       Partial discharge with high energy density           D2          2
4       Thermal fault with temp. less than 300 °C            T1          3
5       Thermal fault with temp. between 300 °C and 700 °C   T2          4
6       Thermal fault with temp. greater than 700 °C         T3          5


3.2. Data collection 

The gas samples of transformers from various substations of Punjab State Transmission Corporation
Ltd (PSTCL), Ludhiana have been used as data for analysis. The data are collected as per the American
Society for Testing and Materials (ASTM) standards. After collection, the data are processed by removing
linear trends, outliers, etc. Table 2 shows samples of the data obtained from PSTCL, Ludhiana. In total,
700 samples were collected from the laboratory. After processing, 600 samples are selected, and 100 samples
per fault are used as input data. For reference, only 5 samples are shown in Table 2.
Table 2. Samples of Data of Dissolved Gases in Power Transformers


Sample No.  H2 (ppm)  CH4 (ppm)  C2H6 (ppm)  C2H4 (ppm)  C2H2 (ppm)
1           95        10         0           11          39
2           150       130        55          9           30
3           75.03     21.99      21.86       132.99      5.56
4           84.72     2.54       1.05        3.52        1.71
5           150       22         9           16          11
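
Because the RNN training described in Section 4 assumes inputs bounded in [0, 1], the gas concentrations have to be rescaled before use. The paper does not state how this was done; the sketch below assumes simple per-gas min-max scaling and applies it to the five samples of Table 2. The function name `min_max_scale` is illustrative only.

```python
def min_max_scale(rows):
    """Scale every column of `rows` to the interval [0, 1].

    Assumption: per-column min-max scaling; a constant column maps to 0.0.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(row, lows, highs)
        ])
    return scaled


# Columns: H2, CH4, C2H6, C2H4, C2H2 (ppm), taken from Table 2
samples = [
    [95, 10, 0, 11, 39],
    [150, 130, 55, 9, 30],
    [75.03, 21.99, 21.86, 132.99, 5.56],
    [84.72, 2.54, 1.05, 3.52, 1.71],
    [150, 22, 9, 16, 11],
]
scaled = min_max_scale(samples)
for row in scaled:
    print([round(v, 3) for v in row])
```

In practice the minimum and maximum would be taken over the training set and reused unchanged for the test samples.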


4. RANDOM NEURAL NETWORK

Random Neural Networks (RNNs) are a type of Artificial Neural Network (ANN) that can also be
regarded as a type of queuing network. The information is presented as a dataset of labeled samples.
The aim is to learn the relationship between input and output features. This learning is done on a set of
examples in order to generate a model with the power of generalising, that is, of making good predictions
for new, unseen inputs. The procedure is based on the classical backpropagation algorithm [11]. As in
practice the input and output variables in learning problems are bounded with known bounds, the algorithm
described in [12] assumes that $a^{(k)} \in [0,1]^{I}$ and $b^{(k)} \in [0,1]^{O}$ for every sample k.
The RNN model as a predictor is a parametric mapping $v(a, w^{+}, w^{-}, L)$, where the parameters $w^{+}$
and $w^{-}$ are adjusted by minimizing the loss function. The network architecture is defined with I input
nodes and O output nodes. There are no additional constraints on the network topology, which means that the
network can be feedforward with one or several layers, or it can be recurrent. The ports of the input
neurons are set each time an input pattern $a^{(k)}$ is presented to the network. The inputs to the positive
ports are set to the input pattern, $A^{+}_{i} = a^{(k)}_{i}$; the negative ports of the input neurons are
conventionally set to zero ($A^{-}_{i} = 0$). The output of the model is the vector of activity rates
produced by the output neurons. The adjustable parameters of the mapping are the connection weights among
the neurons. The RNN is implemented using two algorithms, namely the LM (Levenberg-Marquardt) and BFGS
(Broyden-Fletcher-Goldfarb-Shanno) algorithms.
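
To make the mapping from input pattern to output activity rates concrete, here is a minimal sketch of the forward computation. It assumes the standard random neural network steady-state equations, q_i = T+_i / (r_i + T-_i), where T+_i and T-_i are the total positive and negative arrival rates at neuron i and r_i is its firing rate (the sum of its outgoing weights). The paper does not write these equations out, so the function name, the tiny 2-input/1-output topology, the weight values and the firing rate assigned to output neurons (`out_rate`) are all illustrative assumptions.

```python
def rnn_activity(w_plus, w_minus, ext_pos, ext_neg, out_rate=1.0, sweeps=100):
    """Return the activity rates q_i of all neurons by fixed-point iteration.

    w_plus[i][j], w_minus[i][j]: excitatory/inhibitory weight from neuron i to j.
    ext_pos[i], ext_neg[i]: external positive/negative input rates of neuron i.
    out_rate: firing rate used for neurons without outgoing weights (assumption).
    """
    n = len(ext_pos)
    rates = []
    for i in range(n):
        r = sum(w_plus[i]) + sum(w_minus[i])
        rates.append(r if r > 0 else out_rate)

    q = [0.0] * n
    for _ in range(sweeps):
        new_q = []
        for i in range(n):
            t_pos = ext_pos[i] + sum(q[j] * w_plus[j][i] for j in range(n))
            t_neg = ext_neg[i] + sum(q[j] * w_minus[j][i] for j in range(n))
            # Activity rates are probabilities, so they are capped at 1.
            new_q.append(min(1.0, t_pos / (rates[i] + t_neg)))
        q = new_q
    return q


# Neurons 0 and 1 are inputs, neuron 2 is the output.
w_plus = [[0.0, 0.0, 0.5], [0.0, 0.0, 0.2], [0.0, 0.0, 0.0]]
w_minus = [[0.0, 0.0, 0.5], [0.0, 0.0, 0.8], [0.0, 0.0, 0.0]]
pattern = [0.4, 0.6, 0.0]      # positive ports set to the input pattern
negative = [0.0, 0.0, 0.0]     # negative ports set to zero
q = rnn_activity(w_plus, w_minus, pattern, negative)
print(q)
```

For a feedforward topology the iteration settles after as many sweeps as there are layers; the fixed-point loop is kept so that the same sketch also covers recurrent connections.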


5. RESULTS AND DISCUSSION 
In the LM algorithm, at each epoch p, the approximation of the Hessian matrix is given by (1).


$H \approx J^{T} J$  (1)


where H is the Hessian matrix and J is the Jacobian matrix of the network errors with respect to the
weights. The damping factor $\mu$ is modified at each epoch. If the calculated error E2 decreases, the
damping factor is divided by a constant B, known as the damping constant: $\mu \leftarrow \mu / B$.
Otherwise, the damping factor is increased by a factor of B: $\mu \leftarrow \mu B$.

The 600 data samples collected from the PSTCL lab were used for training the RNN, with 100 samples
for each fault. After 500 iterations, the error is reduced from 1038.21 to 0.454. The damping factor is
taken as 3 and the damping constant as B = 10. The LM algorithm calculates the weight correction as


$\delta w = -G \, g$  (2)


where g is the gradient of the cost function and G is the inverse of the Hessian matrix. The weights are
then updated by $\delta w$. The error plot is given in Figure 1.
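
The damping rule and the weight correction can be sketched on a toy problem. The example below is not the RNN training code of this paper: it fits a single parameter w of the hypothetical model y = exp(w x), so that the Hessian approximation reduces to a scalar, and it assumes the common Levenberg-Marquardt form in which the damping factor is added to the Gauss-Newton term. The starting damping factor 3 and damping constant B = 10 follow the values quoted above; the data and the function name are illustrative.

```python
import math


def lm_fit(xs, ys, w=0.0, mu=3.0, B=10.0, epochs=50):
    """One-parameter Levenberg-Marquardt loop; returns (w, error history)."""
    def sse(wv):
        return sum((math.exp(wv * x) - y) ** 2 for x, y in zip(xs, ys))

    err = sse(w)
    history = [err]
    for _ in range(epochs):
        e = [math.exp(w * x) - y for x, y in zip(xs, ys)]   # residuals
        J = [x * math.exp(w * x) for x in xs]                # d e_i / d w
        g = sum(j * r for j, r in zip(J, e))                 # gradient of 0.5 * SSE
        H = sum(j * j for j in J) + mu                       # damped J^T J (scalar)
        dw = -g / H                                          # weight correction
        new_err = sse(w + dw)
        if new_err < err:
            # Error decreased: accept the step and reduce the damping factor.
            w, err = w + dw, new_err
            mu /= B
        else:
            # Error did not decrease: reject the step and increase the damping.
            mu *= B
        history.append(err)
    return w, history


xs = [0.0, 0.5, 1.0, 1.5]
ys = [math.exp(0.7 * x) for x in xs]     # synthetic data with true w = 0.7
w_fit, history = lm_fit(xs, ys)
print(w_fit, history[0], history[-1])
```

Since a step is accepted only when it lowers the error, the recorded error never increases from one epoch to the next.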

BFGS (Broyden-Fletcher-Goldfarb-Shanno) algorithm: The BFGS algorithm for the RNN model was
proposed in 2000 by Likas and Stafylopatis [16]. It is an offline algorithm. At each epoch, an approximate
Hessian matrix is computed. The method starts with a positive definite Hessian matrix, and the gradient and
cost function are initialised at the starting point. The weight correction is calculated using the same
equation (2) as for the Levenberg-Marquardt algorithm. The gradient and cost function are then
recalculated, and the change in the gradient is given by


$\delta g = g' - g$  (3)


where g' denotes the newly calculated gradient and g the previous one. Using the BFGS algorithm, after
500 iterations the error is reduced from 1038.21 to 0.166. The error graph for BFGS is shown in Figure 2.
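
The steps just described (positive definite start, weight correction from the inverse Hessian, change in gradient) can be sketched as follows. This is a generic BFGS loop on a hypothetical two-parameter quadratic cost, not the RNN training code of the paper. The backtracking step-size search and the textbook update of the inverse Hessian approximation G are assumptions, since the paper does not spell them out.

```python
def bfgs_minimise(f, grad, w, iters=50):
    """Minimise f with BFGS, keeping an approximation G of the inverse Hessian."""
    n = len(w)
    # Start from a positive definite matrix (the identity).
    G = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = grad(w)
    for _ in range(iters):
        # Search direction from the weight correction -G g.
        d = [-sum(G[i][j] * g[j] for j in range(n)) for i in range(n)]
        slope = sum(gi * di for gi, di in zip(g, d))
        f0, step = f(w), 1.0
        # Backtracking: halve the step until the cost decreases sufficiently.
        while (f([wi + step * di for wi, di in zip(w, d)])
               > f0 + 1e-4 * step * slope and step > 1e-12):
            step *= 0.5
        s = [step * di for di in d]                  # applied weight change
        w_new = [wi + si for wi, si in zip(w, s)]
        g_new = grad(w_new)
        y = [a - b for a, b in zip(g_new, g)]        # change in gradient
        ys = sum(yi * si for yi, si in zip(y, s))
        if ys > 1e-12:                               # keeps G positive definite
            rho = 1.0 / ys
            # G <- (I - rho s y^T) G (I - rho y s^T) + rho s s^T
            A = [[(1.0 if i == j else 0.0) - rho * s[i] * y[j]
                  for j in range(n)] for i in range(n)]
            AG = [[sum(A[i][k] * G[k][j] for k in range(n))
                   for j in range(n)] for i in range(n)]
            G = [[sum(AG[i][k] * A[j][k] for k in range(n)) + rho * s[i] * s[j]
                  for j in range(n)] for i in range(n)]
        w, g = w_new, g_new
    return w


def cost(w):
    return (w[0] - 1.0) ** 2 + 10.0 * (w[1] + 2.0) ** 2


def cost_grad(w):
    return [2.0 * (w[0] - 1.0), 20.0 * (w[1] + 2.0)]


w_opt = bfgs_minimise(cost, cost_grad, [0.0, 0.0])
print(w_opt)
```

Unlike the LM sketch, no Jacobian product is formed here: curvature information is accumulated only from successive weight and gradient changes.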
Figure 1. Error plot for LM algorithm


Figure 2. Error plot for BFGS algorithm 


After training, both the LM and BFGS algorithms were tested on 300 samples, with 50 samples per
fault. Linear regression and the Bland-Altman plot, which are used to compare clinical methods for
diagnosis, are applied to compare the two algorithms. In the linear regression case, the output of the
proposed algorithm is plotted against the actual output, which gives the coefficient of determination
$r^{2}$; SSE denotes the sum of squared errors [8]. It is an approach to model the relationship between the
actual output and the output of the proposed algorithm. Regression minimizes the sum of squared differences
from each point to the fitted line (or curve).
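
The two regression quantities can be computed as in the following minimal sketch, which fits an ordinary least-squares line of predicted output against actual output. The function name and the small set of fault codes used as data are hypothetical and are not results from this paper.

```python
def linear_fit_stats(actual, predicted):
    """Least-squares line predicted = slope * actual + intercept.

    Returns (slope, intercept, r2, sse). Both inputs must contain
    at least two distinct values.
    """
    n = len(actual)
    mean_x = sum(actual) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in actual)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(actual, predicted))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # SSE: squared vertical distances from each point to the fitted line.
    sse = sum((y - (slope * x + intercept)) ** 2
              for x, y in zip(actual, predicted))
    sst = sum((y - mean_y) ** 2 for y in predicted)
    r2 = 1.0 - sse / sst
    return slope, intercept, r2, sse


# Hypothetical fault codes: actual versus network output
actual = [0, 1, 2, 3, 4, 5]
predicted = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0]
slope, intercept, r2, sse = linear_fit_stats(actual, predicted)
print(slope, intercept, r2, sse)
```

A perfect classifier would give a slope of 1, an intercept of 0, an SSE of 0 and a coefficient of determination of 1.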


The Bland-Altman plot [9] is a plot for matched-pairs analysis which shows the differences between
the actual output and the output predicted by the proposed method versus the means of these two outputs.
RPC denotes the reproducibility coefficient and CV the coefficient of variation. Figure 3 and Figure 4 show
both plots for the LM and BFGS algorithms, respectively.
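
The quantities behind a Bland-Altman plot can be sketched as below. The code computes the pairwise means and differences, the mean difference (bias) and the limits of agreement. It assumes the commonly used definition RPC = 1.96 times the standard deviation of the differences; the paper does not state which definition was used, and CV is omitted because several definitions exist. The function name and data are illustrative.

```python
import statistics


def bland_altman(actual, predicted):
    """Return the points and summary statistics of a Bland-Altman plot."""
    diffs = [p - a for a, p in zip(actual, predicted)]
    means = [(p + a) / 2.0 for a, p in zip(actual, predicted)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)          # sample standard deviation
    rpc = 1.96 * sd                       # assumed definition of RPC
    return {
        "means": means,                   # x axis of the plot
        "diffs": diffs,                   # y axis of the plot
        "bias": bias,
        "rpc": rpc,
        "limits": (bias - rpc, bias + rpc),
    }


# Hypothetical fault codes: actual versus network output
actual = [0, 1, 2, 3, 4, 5]
predicted = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0]
result = bland_altman(actual, predicted)
print(result["bias"], result["rpc"], result["limits"])
```

A smaller RPC and a bias close to zero indicate closer agreement between the predicted and the actual output.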
Figure 3. (a) Linear regression; (b) Bland-Altman plot of LM
Figure 4. (a) Linear regression; (b) Bland-Altman plot of BFGS


A confusion matrix, also known as an error matrix, is a specific table that is used to visualize the
performance of an algorithm; in unsupervised learning it is usually called a matching matrix. Each row of
the matrix represents the output of the proposed algorithm, while each column represents the actual output.
Figure 5 and Figure 6 show the confusion matrices for the Random Neural Network classifying the different
transformer faults using the LM and BFGS algorithms. The accuracy obtained is 99.33% for BFGS and 94.66%
for LM. For comparison, the accuracies reported in [5] are 95.6% for the Probabilistic Neural Network
classifier and 93.6% for the Backpropagation Network classifier (Levenberg-Marquardt method).
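
A minimal sketch of how such a matrix and the overall accuracy are obtained from class labels is given below. It follows the layout stated above (rows are the algorithm output, columns the actual output) and the six fault codes of Table 1. The label lists are hypothetical and are not the test data of this paper.

```python
def confusion_matrix(actual, predicted, n_classes=6):
    """cm[p][a] counts samples of actual class a that were classified as p."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        cm[p][a] += 1
    return cm


def accuracy(cm):
    """Fraction of samples on the main diagonal (correctly classified)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


# Hypothetical fault codes (0 = PD ... 5 = T3); one PD sample is misread as D1
actual = [0, 1, 2, 3, 4, 5, 0, 1]
predicted = [0, 1, 2, 3, 4, 5, 1, 1]
cm = confusion_matrix(actual, predicted)
for row in cm:
    print(row)
print(accuracy(cm))
```

Off-diagonal entries show which fault types are confused with each other, which a single accuracy figure cannot reveal.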

For further validation of the results, ROC curves [10] are plotted for both algorithms and shown in
Figure 7 and Figure 8. The Receiver Operating Characteristic (ROC) curve is a plot of the true positive
rate against the false positive rate for the different possible cut-points of a diagnostic test. The AUC
(Area Under Curve) is equal to the probability that an algorithm will rank a randomly chosen positive
instance higher than a randomly chosen negative one, and is a measure of test accuracy. The area under the
ROC curve is 80.6% for the LM algorithm, while for BFGS it is 80.3%. Table 3 shows the comparison of both
algorithms based on sensitivity and specificity.
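
The probabilistic reading of the AUC can be turned directly into a computation: count, over all positive/negative pairs, how often the positive instance receives the higher score. The sketch below does this, counting ties as one half. The scores are made up for illustration, and the function name is hypothetical.

```python
def auc_by_ranking(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5       # a tie counts as half a correct ranking
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores for one fault class against the rest
pos_scores = [0.9, 0.8, 0.4]
neg_scores = [0.7, 0.3, 0.2]
auc = auc_by_ranking(pos_scores, neg_scores)
print(auc)
```

For a multi-class problem such as the six fault types, this is typically applied one class at a time against all other classes.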
Figure 5. Confusion matrix for LM algorithm
Figure 6. Confusion matrix for BFGS algorithm


Table 3. Comparison of BFGS and LM Training Algorithm for RNN 


Algorithm  Recognition rate  Sensitivity  Specificity
LM         95%               0.955        0.99
BFGS       99%               0.993        1
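
Sensitivity and specificity for a multi-class classifier are usually derived per class from the confusion matrix, treating one class as positive and all others as negative. The paper does not state how the single values in Table 3 were aggregated, so the sketch below only shows the per-class computation. It uses a hypothetical 3-class matrix with rows as algorithm output and columns as actual output.

```python
def sensitivity_specificity(cm, cls):
    """One-vs-rest sensitivity and specificity for class `cls`.

    cm[p][a] counts samples of actual class a that were classified as p.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = cm[cls][cls]
    fn = sum(cm[p][cls] for p in range(n) if p != cls)   # missed positives
    fp = sum(cm[cls][a] for a in range(n) if a != cls)   # false alarms
    tn = total - tp - fn - fp
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical confusion matrix for three classes
cm_example = [
    [5, 1, 0],
    [0, 4, 0],
    [0, 0, 5],
]
for c in range(3):
    print(c, sensitivity_specificity(cm_example, c))
```

Averaging the per-class values is one possible way to arrive at a single figure per algorithm.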
Figure 7. ROC for LM
Figure 8. ROC for BFGS


6. CONCLUSIONS


An RNN-based fault diagnosis method (an intelligent method based on AI techniques) is proposed for
fault recognition in power transformers. Based on the results obtained, it is concluded that detection of
faults in power transformers can be done efficiently using this intelligent method. In this paper, the RNN
is implemented using two different algorithms, i.e. LM and BFGS. The two algorithms are compared using
various techniques such as regression plots, confusion matrix and ROC curve (proven methods to corroborate
any diagnostic test or algorithm). It is also observed that the results from BFGS are better than those
from the LM algorithm. Moreover, the memory requirement of BFGS is lower than that of the LM algorithm, as
BFGS is a quasi-Newton method and will converge in fewer steps [19]. Hence, the proposed RNN with the BFGS
algorithm can effectively be implemented to diagnose faults in power transformers.